Clustering of reads with alignment-free measures and quality values
Internal identifier: 001861 (Main/Exploration); previous: 001860; next: 001862
Authors: Matteo Comin [Italy]; Andrea Leoni [Italy]; Michele Schimd [Italy]
Source:
- Algorithms for Molecular Biology : AMB [ 1748-7188 ] ; 2015.
Abstract
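The data volume generated by Next-Generation Sequencing (NGS) technologies is growing at a pace that is now challenging the storage and data processing capacities of modern computer systems. In this context an important aspect is the reduction of data complexity by collapsing redundant reads into a single cluster to improve the run time, memory requirements, and quality of post-processing steps like assembly and error correction. Several alignment-free measures, based on k-mer counts, have been used to cluster reads.
Quality scores produced by NGS platforms are fundamental for various analyses of NGS data, such as read mapping and error detection. Moreover, future-generation sequencing platforms will produce long reads but with a large number of erroneous bases (up to 15%).
In this scenario it will be fundamental to exploit quality value information within the alignment-free framework. To the best of our knowledge, this is the first study that incorporates quality value information and k-mer counts, in the context of alignment-free measures, for the comparison of read data. Based on these principles, in this paper we present a family of alignment-free measures called D^q-type. A set of experiments on simulated and real read data confirms that the new measures are superior to other classical alignment-free statistics, especially when erroneous reads are considered. Results on de novo assembly and metagenomic read classification also show that the introduction of quality values improves over standard alignment-free measures. These statistics are implemented in a software tool called QCluster (http://www.dei.unipd.it/~ciompin/main/qcluster.html).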
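The idea of combining k-mer counts with quality values can be illustrated with a minimal sketch. It is not the QCluster implementation, and the abstract does not give the exact definition of the D^q-type measures. The sketch assumes that each k-mer occurrence is weighted by the probability that all of its bases were called correctly, derived from Phred scores, and that two reads are compared with a plain D2-style inner product. The function names (`phred_to_error_prob`, `kmer_counts`, `quality_weighted_kmer_counts`, `d2`) are hypothetical.

```python
from collections import Counter, defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def kmer_counts(seq, k):
    """Plain k-mer counts of a read (no quality information)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def quality_weighted_kmer_counts(seq, quals, k):
    """Assumed weighting: each occurrence contributes the probability
    that all k of its bases are correct, instead of contributing 1."""
    weights = defaultdict(float)
    for i in range(len(seq) - k + 1):
        p_correct = 1.0
        for q in quals[i:i + k]:
            p_correct *= 1.0 - phred_to_error_prob(q)
        weights[seq[i:i + k]] += p_correct
    return dict(weights)


def d2(x, y):
    """D2-style similarity: inner product of two k-mer count vectors."""
    return sum(x[w] * y[w] for w in x.keys() & y.keys())


read = "ACGTACG"
high_quality = [40] * len(read)  # error probability 1e-4 per base
low_quality = [3] * len(read)    # error probability about 0.5 per base

plain = kmer_counts(read, 3)
weighted_hi = quality_weighted_kmer_counts(read, high_quality, 3)
weighted_lo = quality_weighted_kmer_counts(read, low_quality, 3)

print(d2(plain, plain))
print(d2(weighted_hi, weighted_hi))
print(d2(weighted_lo, weighted_lo))
```

Under this assumed weighting, a read with high quality values scores almost the same as with plain counts, while a low-quality read contributes much less to the similarity, which is the behaviour the abstract motivates for error-prone reads.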
Url:
DOI: 10.1186/s13015-014-0029-x
PubMed: 25691913
PubMed Central: 4331138
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000244
- to stream Pmc, to step Curation: 000244
- to stream Pmc, to step Checkpoint: 000E93
- to stream PubMed, to step Corpus: 001696
- to stream PubMed, to step Curation: 001696
- to stream PubMed, to step Checkpoint: 001608
- to stream Ncbi, to step Merge: 001045
- to stream Ncbi, to step Curation: 001045
- to stream Ncbi, to step Checkpoint: 001045
- to stream Main, to step Merge: 001866
- to stream Main, to step Curation: 001861
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Clustering of reads with alignment-free measures and quality values</title>
<author><name sortKey="Comin, Matteo" sort="Comin, Matteo" uniqKey="Comin M" first="Matteo" last="Comin">Matteo Comin</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Leoni, Andrea" sort="Leoni, Andrea" uniqKey="Leoni A" first="Andrea" last="Leoni">Andrea Leoni</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Schimd, Michele" sort="Schimd, Michele" uniqKey="Schimd M" first="Michele" last="Schimd">Michele Schimd</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">25691913</idno>
<idno type="pmc">4331138</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331138</idno>
<idno type="RBID">PMC:4331138</idno>
<idno type="doi">10.1186/s13015-014-0029-x</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000244</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000244</idno>
<idno type="wicri:Area/Pmc/Curation">000244</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000244</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000E93</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000E93</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:25691913</idno>
<idno type="wicri:Area/PubMed/Corpus">001696</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001696</idno>
<idno type="wicri:Area/PubMed/Curation">001696</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001696</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001608</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001608</idno>
<idno type="wicri:Area/Ncbi/Merge">001045</idno>
<idno type="wicri:Area/Ncbi/Curation">001045</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001045</idno>
<idno type="wicri:Area/Main/Merge">001866</idno>
<idno type="wicri:Area/Main/Curation">001861</idno>
<idno type="wicri:Area/Main/Exploration">001861</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Clustering of reads with alignment-free measures and quality values</title>
<author><name sortKey="Comin, Matteo" sort="Comin, Matteo" uniqKey="Comin M" first="Matteo" last="Comin">Matteo Comin</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Leoni, Andrea" sort="Leoni, Andrea" uniqKey="Leoni A" first="Andrea" last="Leoni">Andrea Leoni</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Schimd, Michele" sort="Schimd, Michele" uniqKey="Schimd M" first="Michele" last="Schimd">Michele Schimd</name>
<affiliation wicri:level="1"><nlm:aff id="Aff1">Department of Information Engineering, University of Padova, Padova, Italy</nlm:aff>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information Engineering, University of Padova, Padova</wicri:regionArea>
<wicri:noRegion>Padova</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j">Algorithms for Molecular Biology : AMB</title>
<idno type="eISSN">1748-7188</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>The data volume generated by Next-Generation Sequencing (NGS) technologies is growing at a pace that is now challenging the storage and data processing capacities of modern computer systems. In this context an important aspect is the reduction of data complexity by collapsing redundant reads into a single cluster to improve the run time, memory requirements, and quality of post-processing steps like assembly and error correction. Several alignment-free measures, based on <italic>k</italic>-mer counts, have been used to cluster reads.</p>
<p>Quality scores produced by NGS platforms are fundamental for various analyses of NGS data, such as read mapping and error detection. Moreover, future-generation sequencing platforms will produce long reads but with a large number of erroneous bases (up to 15<italic>%</italic>).</p>
</sec>
<sec><title>Results</title>
<p>In this scenario it will be fundamental to exploit quality value information within the alignment-free framework. To the best of our knowledge, this is the first study that incorporates quality value information and <italic>k</italic>-mer counts, in the context of alignment-free measures, for the comparison of read data. Based on these principles, in this paper we present a family of alignment-free measures called <italic>D</italic><sup><italic>q</italic></sup>-type. A set of experiments on simulated and real read data confirms that the new measures are superior to other classical alignment-free statistics, especially when erroneous reads are considered. Results on <italic>de novo</italic> assembly and metagenomic read classification also show that the introduction of quality values improves over standard alignment-free measures. These statistics are implemented in a software tool called QCluster (http://www.dei.unipd.it/~ciompin/main/qcluster.html).</p>
</sec>
</div>
</front>
</TEI>
<affiliations><list><country><li>Italie</li>
</country>
</list>
<tree><country name="Italie"><noRegion><name sortKey="Comin, Matteo" sort="Comin, Matteo" uniqKey="Comin M" first="Matteo" last="Comin">Matteo Comin</name>
</noRegion>
<name sortKey="Leoni, Andrea" sort="Leoni, Andrea" uniqKey="Leoni A" first="Andrea" last="Leoni">Andrea Leoni</name>
<name sortKey="Schimd, Michele" sort="Schimd, Michele" uniqKey="Schimd M" first="Michele" last="Schimd">Michele Schimd</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001861 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001861 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:4331138 |texte= Clustering of reads with alignment-free measures and quality values }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:25691913" \
| HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
| NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.